Abstract. In this paper we study the problem of sorting under non-uniform comparison
costs, where costs are either 1 or $\infty$. If comparing a pair has an associated cost of $\infty$
then we say that the pair cannot be compared (a forbidden pair). Along with the set
of elements $V$, the input to our problem is a graph $G(V,E)$ whose edges represent the
pairs that we can compare at unit cost. Given a graph with $n$ vertices and
$q$ forbidden edges, we propose the first non-trivial deterministic algorithm, which makes
$O((q + n)\log n)$ comparisons with a total complexity of $O(n^2 + q^{\omega/2})$, where $\omega$ is the
exponent in the complexity of matrix multiplication. We also propose a simple randomized
algorithm for the problem which makes $\tilde{O}(n^2/\sqrt{q + n} + n\sqrt{q})$ probes with high probability.
When the input graph is random we show that $O(\min(\ldots))$ probes suffice, where $p$
is the edge probability.
1 Introduction

Comparison-based sorting is one of the most studied areas in theoretical computer
science. However, the majority of the effort has focused on the uniform comparison cost
model. Arbitrary non-uniform cost models can make trivial problems non-trivial, like finding
the minimum [?]. Thus it makes sense to consider a more structured cost. For example, a
common cost model is the monotone¹ cost model. As shown in [5], the best one can do is to get
an algorithm that is within a logarithmic factor of a cost-optimal algorithm. However, the 1-$\infty$
cost model in this paper is not monotone. This model has a comparison cost of 1 or $\infty$. A pair
with cost $\infty$ is considered a “forbidden pair”. The set of pairs with comparison cost 1 defines an
undirected graph $G(V, E)$, where $V$ is the set of keys and $E$ represents the allowed comparisons.
We call $G$ the comparison graph. Define $E_f$ to be the set of forbidden pairs. Let $|V| = n$ and
$|E_f| = q$.

An example of a problem that uses this model is the nuts and bolts problem. Strictly speaking,
this is a matching problem rather than a sorting problem. One is given two sets of elements,
a set of nuts and a set of bolts. Elements in each set have distinct sizes, and for each nut it is
guaranteed that there exists a unique bolt of the same size. Matching is performed by comparing
a nut with a bolt. However, pairs of nuts or pairs of bolts cannot be compared. So in this case
$G = K(N, B)$ is a complete bipartite graph with edges from the set of nuts $N$ to the set of
bolts $B$. This problem was solved in the mid 1990s [7, ?]. The existence of an $O(n \log n)$ time
deterministic algorithm was proved for it using the theory of bipartite expanders [7].

The problem of sorting with forbidden pairs is still open for the most part. It is closely related
to the problem of partial sorting under a relation-determining oracle. In this model we are given
a set $P$ of elements and an oracle $O_r$ which is used to determine the relations between pairs of
elements in $P$. The goal is to determine all the valid relations. The number of queries made to $O_r$ is
defined as the query complexity. Since there are $\Omega(2^{n^2/4})$ [?] labelled posets with $n$ elements,
it immediately follows that the information theoretic bound (ITB) for the query complexity
is $\Omega(n^2)$. This has been investigated for width-bounded posets in [17], where the authors
show that if $P$ has width at most $w$ then the ITB for the query complexity is $O((w + \log n)n)$.
They presented a query-optimal algorithm for width-bounded posets whose total complexity is
$O(nw^2 \log \frac{n}{w})$. Their main contribution was an efficient data structure which is
used to store a poset as disjoint chains and to query unknown relations using a weighted binary
search method. This algorithm can be generalized to any poset with an additional $\log w$ factor
added to the query complexity. Their results were the first major extension in this line of
research after the seminal work by Faigle and Turán [18], which showed the existence of such an
algorithm, although an efficient implementation of it was not known at the time. Another
similar problem is the local sorting problem. In this problem $V$ is an ordered set and for each
$(u, v) \in E$ we want to determine their relative order. The problem is to determine if this can be
done without resorting to sorting the entire set $V$, since the ITB for this problem is $\Omega(n \log \Delta)$
in the standard comparison tree model (where $\Delta$ is the maximum degree of $G$). Currently no
non-trivial deterministic algorithm is known for this problem. However, there is a randomized
algorithm which makes an optimal number of comparisons with high probability [?].

¹ By monotone we mean that the cost of comparing a pair is a monotone function of the values of the
pair.

The query model used in this paper differs from [17] in the following manner: we do not charge for
checking whether an edge exists; we only charge for the comparisons made. The number of
comparisons made, or rather asked of the oracle, is naturally defined as the comparison complexity
or the probe complexity. However, no non-trivial ITB for the probe complexity is known in the
standard decision tree model. We believe that the model is too weak for this purpose. For
example, given a comparison graph $G$, the number of different acyclic orientations of $G$ gives
an upper bound on the number of possible answers. Given the fact that it does not take any
comparisons to identify $G$ (up to isomorphism) and $G$ has at most $\prod_{v \in V}(d(v) + 1) \le n^n$ [?]
acyclic orientations, the ITB of $O(n \log n)$ in the standard comparison tree model is
too weak for this problem. The matter is further complicated if one is also given the guarantee
that the graph $G$ is sortable. We say $G$ is sortable if $G$ can be totally sorted. This restriction further
reduces the number of possible answers for graphs with a small number of edges. For example, if $G$
has $\le n - 1$ edges then we can determine the unique total order by making just one comparison.
Since any acyclic orientation of the edges of $G$ must give a Hamiltonian path and $G$ has $\le n - 1$
edges, the edges must link consecutive vertices in the unknown order. A solitary probe is then
used to determine the direction of this ordering. In this paper we take $G$ to be arbitrary and not
necessarily sortable. Hence by sorting $G$ we mean determining the orientations of the edges of
$G$ such that the resulting partial order (which is unique) has the maximum number of relations.

In this paper we propose the first non-trivial deterministic algorithm under the probe complexity
model as well as a randomized algorithm. The results are expressed in terms of $n$ and
$q$. Expressing the results in terms of the number of forbidden edges fits naturally with the problem.
First of all, $q$ and $w$ are related. Let $P_q$ be the poset found after sorting $G$. We have
$q \ge$ # of incomparable pairs in $P_q \ge \binom{w}{2}$. Hence, $w = O(\sqrt{q})$. Although we cannot directly
compare the probe complexity used in this paper with the query complexity in [17], this gives a
better sense of the relatedness of the two models. Secondly, in the absence of any other structural
properties of the input graph $G$, $q$ gives a good indication of how difficult it is to sort $G$. For
example, when $q = O(\log n)$, it is easy to see that one can sort in $O(n \log n)$ total time. To
do this we pick an arbitrary pair of non-adjacent vertices and take out one of them, removing
it from the graph. We do the same thing with the remaining graph until the graph remaining
is a clique. Each removal eliminates at least one forbidden pair, so we take out at most $O(\log n)$
vertices. Then we sort this clique with $O(n \log n)$ comparisons and merge the vertices we had
removed previously by probing all the remaining undirected edges, of which there are at most
$O(n \log n)$. On the other extreme, if $|E| = \binom{n}{2} - q = \ldots$ then it can be shown that we need
to make $\Omega(|E|)$ probes to determine
the partial order, since the complete bipartite graph $K(A,B)$ with $|A| \ll |B|$ has many acyclic
orientations [?]. So in this case one has to probe most of the allowed edges.
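The small-$q$ strategy described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the adjacency map `adj`, the comparison oracle `less`, and the function names are hypothetical, and the quadratic scan for a non-adjacent pair is chosen for brevity rather than speed.

```python
from functools import cmp_to_key
from itertools import combinations

def peel_to_clique(vertices, adj):
    """Remove one endpoint of a non-adjacent pair until the rest is a clique."""
    core, removed = list(vertices), []
    while True:
        bad = next(((u, v) for u, v in combinations(core, 2) if v not in adj[u]), None)
        if bad is None:
            return core, removed
        core.remove(bad[1])        # each removal destroys at least one forbidden pair
        removed.append(bad[1])

def sort_small_q(vertices, adj, less):
    """Sort the clique with the oracle, then probe every allowed edge of a removed vertex."""
    core, removed = peel_to_clique(vertices, adj)
    order = sorted(core, key=cmp_to_key(lambda a, b: -1 if less(a, b) else 1))
    relations = set(zip(order, order[1:]))          # chain on the sorted clique
    for r in removed:
        for v in adj[r]:                            # one probe per allowed edge
            relations.add((r, v) if less(r, v) else (v, r))
    return order, removed, relations
```

The returned `relations` still need a transitive closure to yield the final partial order; that step uses no further probes.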

In the context of randomized algorithms, this problem has been studied in [1, ?]. The authors
of [1] proposed a randomized algorithm that sorts $G$ with a probe complexity of $\tilde{O}(n^{3/2})$ with high
probability². However, their implementation uses as a sub-routine a polynomial-time uniform sampling
algorithm to sample points from a convex polytope [?]. The authors did not discuss the exact
bound on the total complexity in their paper. At each step the algorithm either finds a balancing
edge³ or finds a subset of elements that can be sorted quickly. For an arbitrary $G$ it is not
guaranteed that a balancing edge always exists. However, when $G$ is the complete graph there
always exists a balancing edge that reduces the number of linear extensions at least by a factor
of 8/11 [13].

² By high probability we mean that the probability tends to 1 as $n \to \infty$.

1.1 Our Results 

The main contributions of this paper are as follows: 

- Given a comparison graph $G$ we propose a deterministic algorithm that sorts $G$ with
  $O((q + n)\log n)$ probes. The total complexity of our algorithm is $O(n^2 + q^{\omega/2})$, where
  $\omega \in [2, 2.38]$ is the exponent in the complexity of matrix multiplication. We use only
  elementary methods in our algorithm. We start by finding a set of large enough cliques in $G$
  and use its elements to determine a good pivot. This is then applied recursively to induced
  subgraphs of $G$ to generate a collection of partial orders. We then merge these partial orders
  in the final step.

- We propose a randomized algorithm which sorts $G$ with $\tilde{O}(n^2/\sqrt{n + q} + n\sqrt{q})$ probes with
  high probability. We use a random graph model for this purpose. The method uses only
  elementary techniques and, unlike in [1], has a total run time of $O(n^{\omega})$ in the worst case.

- When $G$ is a random graph with edge probability $p$ we show that one can sort $G$ with high
  probability using only $O(\min(\ldots))$ probes.

The rest of this paper is organized as follows: in section 1.2 we introduce some definitions and 
lemmas for later use. Section 2 details the proposed deterministic algorithm. In section 3 we 
introduce the randomized algorithm and its extension to random graphs. 


1.2 Definitions 

Recall $G(V, E)$ is the input graph on the set $V$ of elements to be sorted. A pair of vertices $(u, v)$
can be compared if $(u, v) \in E$; otherwise, we say the pair is forbidden and is in $E_f$. The graph $G$
is given to us by our adversary. Let $G_i$ be the graph after $i$ edges have been oriented and $P_i$ be
the associated partial order. We denote the degree of a vertex $v$ by $d(v)$, and $n(v) = n - 1 - d(v)$
is the number of vertices that are not adjacent to $v$. The set of neighbors of a vertex $v$ is denoted
by $N(v)$. We use the notation $E(A, B)$ to denote the set of edges between the sets of vertices
$A, B \subset V$. We also define the little-o notation to remove any ambiguity from our exposition.

Definition 1. If $f(n) \in o(g(n))$ then $f(n) \in O(g(n))$ but $f(n) \notin \Omega(g(n))$.

Lemma 1. Let $\{f_1(n), f_2(n), \ldots, f_k(n)\}$ be a finite set of non-negative monotonically increasing
functions in $n$ such that:

1. $\forall i \; f_i(n) \in o(g(n))$

2. $\sum_i f_i(n) \le c\,g(n)$

If $F(n) = \sum_i f_i^2(n)$ then $F(n) \in o(g^2(n))$.

Proof. See appendix. □ 

Lemma 2. Let $T(n) = \sum_i T(n_i) + f(n)$ where $\sum_i n_i \le \delta n$ for some $0 < \delta < 1$ and
$f(n) \in o(n^2)$. Then, $T(n) \in o(n^2)$.

Proof. See appendix. □ 

³ An edge in $G$ whose orientation, once revealed, is guaranteed to reduce the number of linear
extensions of the current partial order by a constant fraction. The pair of vertices incident to this
edge is referred to as a balancing pair.
2 A Deterministic Algorithm For Restricted Sorting

First we look at a simple case where $q = O(n)$. We will use some of the main ideas from
this algorithm to extend it to the general case. This initial algorithm has a worse probe
complexity than the main algorithm.

2.1 A Restricted Case 

Assume $q \le cn$ for some constant $c$. Let $R = \{v \in V \mid n(v) > c_1\}$ for some constant $c_1$. Then
$|R| < (2c/c_1)n$. This is obvious from the fact that $\sum_{v \in V} n(v) = 2q \le 2cn$. We choose $c_1 = 4c$. Let
$S = V \setminus R$ and $G[S]$ be the induced subgraph generated by $S$. We have $|S| > n/2$, and if $v \in S$
then $n(v) \le c_1$.

Claim 1. There exists a subset $X \subseteq S$ such that $|X| \ge n/(2(4c + 1))$ and $G[X]$ is a complete
graph.

Proof. Let us construct $X$ explicitly. We start with $X = \{u\}$, where $u$ is an arbitrary vertex in $S$.
We pick successive vertices from $S$ iteratively. Let $v$ be the last vertex added to $X$. Since $v$
has at least $n - 1 - c_1$ neighbors, whenever we pick a neighbor of $v$ from $S$ to add to $X$ we lose at
most $c_1 + 1$ candidate vertices (including the vertex we picked). Hence, if we keep picking
neighbors in this way, the size of $X$ is at least $|S|/(c_1 + 1) > n/(2(4c + 1))$.

□
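The construction in the proof of Claim 1 can be sketched as a greedy procedure. The function names and the adjacency map `adj` (vertex to set of neighbors) are hypothetical; the code only inspects the graph and makes no comparisons.

```python
def low_nondegree_vertices(vertices, adj, c1):
    """The set S: vertices with at most c1 non-neighbors."""
    n = len(vertices)
    return [v for v in vertices if n - 1 - len(adj[v]) <= c1]

def greedy_clique(S, adj):
    """Pick a candidate, then keep only the candidates adjacent to it.

    Each pick discards the picked vertex plus at most c1 non-neighbors,
    so the clique has at least |S| / (c1 + 1) vertices.
    """
    candidates = list(S)
    X = []
    while candidates:
        v = candidates[0]
        X.append(v)
        candidates = [u for u in candidates[1:] if u in adj[v]]
    return X
```

Every vertex kept in `candidates` is adjacent to all vertices already in `X`, which is why the result is a clique.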

Clearly the above procedure runs in $O(n^2)$ time and makes no comparisons. Now we are ready
to describe our algorithm. The main algorithm is recursive and has two levels of recursion.
We break the algorithm into several steps.

Initial Sorting: Given the input graph $G$, let $X$ be a clique with $|X| \ge n/(2(4c + 1))$ (Claim
1). Let $Y = V \setminus X$. Note that $|Y| \le n - n/(2(4c + 1)) = ((8c + 1)/(8c + 2))n$. Now we sort $X$ using
$O(n \log n)$ comparisons, as $G[X]$ is a complete graph. We can use any standard comparison-based
sorting algorithm for this purpose. Now we have two possibilities:

Case 1: If $|Y| = o(n)$⁴, then we probe all edges of $G[Y]$ and $G[Y, X]$, where $G[Y, X]$ is the induced
bipartite graph generated by the sets $Y$ and $X$. Then we take the transitive closure of the
resulting relations, which does not need any additional probes. It can easily be seen that
the number of probes made in the previous step is $o(n^2)$. For the sake of contradiction,
assume that it is not so; then $|X||Y| + |Y|^2/2 > dn^2$ for some constant $d$, which implies $|Y| > dn$,
since $|X| + |Y|/2 < n$. But then $|Y| = \Omega(n)$, which contradicts our earlier
assumption. So in this case we sort $V$ by making only $o(n^2)$ probes.

Case 2: Otherwise $|Y| \ge \delta n$ for some constant $\delta$. In this case we recursively partition $Y$ based on
elements from $X$. We call this the partition step.

Partition step: We will recursively partition both $X$ and $Y$. To keep track of the current
partition depth we rename $X$ to $X_{00}$ and $Y$ to $Y_{00}$. We pick $m_{00}$, the median of $X_{00}$ (after $X_{00}$
is sorted). Since $X_{00} \subseteq S$ we have $n(m_{00}) \le c_1$. So $m_{00}$ will be comparable to all but at most $c_1$
elements of $Y_{00}$. Let

$N_{00} = \{v \in Y_{00} \mid v \in N(m_{00})\}$

and $B_{00} = Y_{00} \setminus N_{00}$. Note $|B_{00}| \le c_1$. Now let $U_{00}$ be the subset of $N_{00}$ whose elements are
$> m_{00}$, and let the set $L_{00}$ account for the rest of $N_{00}$. Let $X_{10}$ and $X_{11}$ be the elements of
$X_{00}$ that are $<$ and $>$ $m_{00}$ respectively. We recursively partition the sets $L_{00}$ and $U_{00}$ using
the medians of $X_{10}$ and $X_{11}$. The B-sets are kept for later processing. We rename the sets $L_{00}$
and $U_{00}$ to $Y_{10}$ and $Y_{11}$. So, the pairs $(X_{10}, Y_{10})$ and $(X_{11}, Y_{11})$ are processed as above, generating
the sets $N_{10}$, $N_{11}$, $B_{10}$ and $B_{11}$. We continue doing this until the size of the X-set is $\le c_2$, where
$c_2$ is some constant. At this point we do not know the size of the Y-set paired with it. There are
two cases we need to consider:

⁴ Note that “$|Y| = o(n)$” is not an algorithmic test. We use it in this algorithm to establish a framework
for the second algorithm, which uses a traditional test.
Case 1: $|Y| = o(n)$. Then we probe all the edges of $G[Y]$ and $G[X, Y]$, which uses at most
$c_2|Y| + \binom{|Y|}{2}$ comparisons.

Case 2: $|Y| \ge \delta n$ for some constant $\delta$. Hence the graph $G[Y]$ can have at most
$q \le cn \le (c/\delta)|Y|$ missing edges. This satisfies our initial premise that the number of missing edges
in $G[Y]$ is linear in the number of vertices. Hence we can apply our initial strategy recursively.
That is, we first find a large enough clique (which according to Claim 1 must exist) and then
use it to partition the rest of the set $Y$.
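A single partition step (pick the median of the sorted X-set, then split the Y-set into the unprobed set $B$ and the lower and upper sets) can be sketched as below. The names `partition_step`, `adj` and `less` are hypothetical, and the sketch assumes the X-set is already sorted in increasing order.

```python
def partition_step(X_sorted, Y, adj, less):
    """Split Y around the median m of the sorted clique X.

    Adjacency checks are free; only the calls to `less` count as probes.
    """
    mid = len(X_sorted) // 2
    m = X_sorted[mid]
    B, L, U = [], [], []
    probes = 0
    for y in Y:
        if y not in adj[m]:
            B.append(y)            # forbidden pair: deferred to the merge step
            continue
        probes += 1
        (U if less(m, y) else L).append(y)
    return {
        "median": m, "B": B, "L": L, "U": U, "probes": probes,
        "X_lo": X_sorted[:mid], "X_hi": X_sorted[mid + 1:],
    }
```

The recursion would then continue on the pairs (`X_lo`, `L`) and (`X_hi`, `U`), stopping once the X-set has constant size.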

Let us visualize this using a recursion tree $T$ (see Fig. 1 below). We shall call $T$ the partial
recursion tree, for reasons soon to be clear. At the root we have the pair $(X_{00}, Y_{00})$. It has two
child nodes $(X_{10}, Y_{10})$ and $(X_{11}, Y_{11})$, each having two children of their own, and so on. Now at
each level, the size of the X-set gets halved. So, the number of levels in $T$ is at most $O(\log n)$.
However, the Y-sets need not get divided in equal proportions. So, at the frontier (the deepest
level) we will have nodes of the above two types, depending on the size of their corresponding
Y-sets. Let the collection of these frontier nodes be partitioned into two sets $\Phi$ and $\Psi$ corresponding
to Case 1 and Case 2 respectively.

We can conclude that the total number of probes needed to compute all relations in $\Phi$ is
$o(n^2)$. This follows from Lemma 1. Here we can map the sizes of the Y-sets of the nodes in the
collection $\Phi$ to the functions $f_i(n)$. We know that the total number of elements in the union of these
Y-sets is $\le |Y_{00}| \le ((8c + 1)/(8c + 2))n$. The total number of probes will be $F(n)$ in the worst case. What is
the total number of probes on the internal nodes of $T$? We know that in the internal nodes we
compare the median of the X-set with the elements of the N-set, which takes $|N|$ probes. Since
the union of these N-sets cannot exceed the total number of vertices in $G$, at each level of $T$ we
do at most $O(n)$ probes, totaling $O(n \log n)$ probes over all the internal nodes.

Unlike the nodes in $\Phi$, the nodes in $\Psi$ recursively call the initial strategy using the input
graph $G[Y]$. Let the probe complexity of our initial strategy be $Q(n)$. Then the recursion for $Q$
is as follows:

$Q(n) = \sum_{i=1}^{m} Q(n_i) + o(n^2)$

Here we assume that the nodes in $\Psi$ are indexed according to some arbitrary order, $m = |\Psi|$, and
$n_i$ is the size of the Y-set of the $i$-th node. We can solve
this recurrence using Lemma 2, giving $Q(n) \in o(n^2)$, since $\sum_i n_i \le ((8c + 1)/(8c + 2))n$. Note here
that $|\Psi|$ is bounded by a constant, since the sizes of the Y-sets are in $\Omega(n)$.

We call $\mathcal{T}$ the full tree. All leaf nodes in $\mathcal{T}$ are in $\Phi$. It is straightforward to show that $\mathcal{T}$ has
$O(\log^2 n)$ levels. Since any of the leaf nodes of $T$ has $|Y| \le \beta n$ (where $\beta = (8c + 1)/(8c + 2)$), its
subtree in $\mathcal{T}$ can have at most $a \log(\beta n) = a \log n - a \log(1/\beta)$ levels, the subtrees of its leaves
at most $a \log n - 2a \log(1/\beta)$ levels, and so on, for some constant $a$.


Merge step: Once we have completed building $\mathcal{T}$ we proceed with the final stage of our algorithm.
Recall that during the forward partition step we generated many B-sets in
the internal nodes of $\mathcal{T}$. Now we start from the leaves of $\mathcal{T}$ and proceed upwards. Each pair of
leaf nodes $l, r$ sharing a common parent $p$ sends a partial order to it (computed as in Case
1). When we merge these two partial orders we know that no extra probe is needed, since they
have already been split by the median of the X-set of $p$. What remains is to probe all edges
between the B-set in $p$ and the elements in this partial order (which constitutes the set of elements
$X \cup N$ of the node $p$), as well as the edges in $G[B]$. Then we pass the resulting partial order to
the parent of $p$, and so on. Since the sizes of the B-sets are bounded by $c_1$ (at any level in $\mathcal{T}$),
the total number of probes we make at a level is then $\le c_1 \sum_i (|X_i| + |Y_i|)$. The sum is taken over all the
nodes in that level. Hence this is bounded by $c_1 n$, so at each level we do at most $O(n)$ probes
in the backward merging step. Since there are at most $O(\log^2 n)$ levels, this totals $O(n \log^2 n)$
additional probes. Adding this to the probe cost of partitioning in the forward step does not
affect the total probe complexity, which was $o(n^2)$. The final step is to compute the transitive
closure of the resulting set of relations, which can be done without any additional probing. Since
computing the transitive closure is equivalent to Boolean matrix multiplication [?], the total
complexity is $O(n^{\omega})$.
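The final closure step uses no probes. As a minimal sketch, the closure of a set of oriented pairs can be computed with a Warshall-style loop over bitset rows; this cubic-time variant is a substitute chosen for brevity and is not the fast matrix multiplication route that gives the $O(n^{\omega})$ bound. The function name and the vertex labeling $0, \ldots, n-1$ are assumptions.

```python
def transitive_closure(n, relations):
    """Closure of pairs (a, b), read as 'a is below b', on vertices 0..n-1."""
    reach = [0] * n                      # reach[i] is a bitset of vertices above i
    for a, b in relations:
        reach[a] |= 1 << b
    for k in range(n):
        for i in range(n):
            if (reach[i] >> k) & 1:      # i reaches k, so i reaches all k reaches
                reach[i] |= reach[k]
    return {(i, j) for i in range(n) for j in range(n) if (reach[i] >> j) & 1}
```

For example, the relations (0, 1), (1, 2), (3, 2) imply (0, 2) but leave 0 and 3 incomparable.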
Fig. 1. Visualizing the steps. At the bottom of $T$ the shaded boxes represent the $\Phi$-nodes and the blue
rectangles the $\Psi$-nodes. The outer dashed triangle represents the full tree $\mathcal{T}$. The tree $\mathcal{T}$ is created during
the partitioning step, and in the merge step we start from the deepest leaves of $\mathcal{T}$ and move upwards.


2.2 The General Case 

We define the sets $R$ and $S$ analogously to Section 2.1. We have $R = \{v \in V \mid n(v) >
c_1 q/n\}$ for some constant $c_1$. With $c_1 = 4$, we get $|R| \le \delta_1 n$ where $\delta_1 \le 2/c_1 = 1/2$. Hence
$|S| \ge (1 - \delta_1)n \ge n/2$. Now we will apply Claim 1 successively to construct a “big-enough” set
$X \subseteq S$ which we will use to find an approximate median of $V$. This set $X$ consists of disjoint
subsets $X_i$ such that $G[X_i]$ is a clique.


Constructing $X$: Let us define $S_t = S \setminus \bigcup_{j=1}^{t} X_j$. We construct the first clique $X_1 \subseteq S$ using
the method detailed in Claim 1. There are two cases:

Case 1, $q \le n$: In this case we can show that $|X_1| \ge (n/2)/(c_1 q/n + 1) \ge n/10$. We take the first
$n/10$ elements and keep the rest for the second round. Now we construct the second clique
$X_2$ from $S_1$, which has at least $2n/25$ vertices. We let $X = X_1 \cup X_2$. Hence $X$ has at least
$9n/50$ vertices.

Case 2, $q > n$: In this case we have $|X_1| \ge (n/2)/(c_1 q/n + 1) \ge n^2/(10q)$. Again we take $|X_1| =
n^2/(10q)$, discarding some vertices if necessary. Similarly we construct $X_2 \subseteq S_1$. It can be
shown that $|X_2| \ge (n^2/(10q))(1 - n/(5q))$, and we keep $(n^2/(10q))(1 - n/(5q))$ vertices in $X_2$; the
rest are set aside to be processed in the next round. In general, for the clique $X_r$ we
have $|X_r| \ge (n^2/(10q))(1 - n/(5q))^{r-1}$. Now we let $X = \bigcup_{i=1}^{r} X_i$. We will show that $|X| \ge \delta_2 n$
for some constant $\delta_2 > 0$. We let $r = 5q/n + 1$. Then we have

$|X_r| \ge (n^2/(10q))(1 - n/(5q))^{r-1} \ge (n^2/(10q))(1 - 1/5)^{5} \ge 3n^2/(100q)$

since $q > n$. Hence, $|X| = \sum_{i=1}^{r} |X_i| \ge (9/50)n$, giving $\delta_2 = 9/50$. Now for each $X_i$
($1 \le i \le r$) we keep a subset $Y_i$ of size $|X_r|$ and throw away the rest. Clearly, for each $i$, the
induced subgraph $G[Y_i]$ is also a clique. Let $Y = \bigcup_{i=1}^{r} Y_i$. We also have $|Y| \ge (9/50)n$.

Computing an approximate median of $V$: We shall compute an approximate median with
respect to all the vertices (the set $V$) and not just the set $S$. We will find an element that
divides the set $V$ in constant proportions. This can be done easily using the set $Y$. For each $Y_i$ we
find its median using $O(|Y_i|)$ probes, since $G[Y_i]$ is a complete graph. Let this median be $m_i$ and
$M$ be the set of these $r$ medians. Since $m_i \in S$, $n(m_i) \le 4q/n$. We define the upper set of $m \in M$
with respect to a set $A \subseteq V$ ($m$ may not be a member of $A$) as $U(m, A) = \{a \in A \mid a > m\}$.
Similarly we define the lower set $L(m, A)$. We want to compute the sets $U(m, Y)$ and $L(m, Y)$.
However, $m$ may not be a neighbor of all the elements in $Y$. So we compute approximate upper
and lower sets by probing all the edges in $E(\{m\}, Y \setminus \{m\})$. These sets are denoted by $\tilde{U}(m, Y)$
and $\tilde{L}(m, Y)$ respectively. It is easy to see that there exists some $m \in M$ which divides $Y$ into
sets of roughly equal sizes (their sizes are within a constant factor of each other). In fact the median of
$M$ is such an element. However, the elements in $M$ may not all be neighbors of each other, hence
we approximate it using the ranks of the elements in $M$ with respect to the set $Y$ (the rank of $m$
is $|\tilde{L}(m, Y)|$). Next we prove that the element $m^*$, picked as an approximate median of $M$ using
the above procedure, is also an approximate median of $Y$.

Claim 2. The element $m^*$ picked as described above is an approximate median of $Y$.

Proof. First we show that the median of $M$ is an approximate median of $Y$. This can be
easily verified. Let us take the elements in $M$ in sorted order $(m_1, \ldots, m_r)$, so the median
of $M$ is $m_{\lfloor r/2 \rfloor}$. Now $L(m_{\lfloor r/2 \rfloor}, Y) \supseteq \bigcup_{i=1}^{\lfloor r/2 \rfloor} L(m_i, Y_i)$. Since the sets $Y_i$ are disjoint and
$|L(m_i, Y_i)| \ge |X_r|/2$, we have $|L(m_{\lfloor r/2 \rfloor}, Y)| \ge |X_r| r/4$ (ignoring the floor). Similarly we can
show that $|U(m_{\lfloor r/2 \rfloor}, Y)| \ge |X_r| r/4$. Hence $m_{\lfloor r/2 \rfloor}$ is an approximate median of $Y$. Now we
show that $|\,|\tilde{L}(m^*, Y)| - |L(m_{\lfloor r/2 \rfloor}, Y)|\,| \le 4q/n$. Consider the sorted order of elements in $M$
according to $|\tilde{L}(m, Y)|$. Since each element $m \in M$ has at most $4q/n$ missing neighbors in
$Y$, we have $|\,|\tilde{L}(m, Y)| - |L(m, Y)|\,| \le 4q/n$. So the approximate rank of an element is at
most $4q/n$ less than its actual rank. Thus an element $m^*$ picked as the median of $M$ using its
approximate rank $|\tilde{L}(m, Y)|$ cannot be more than $4q/n$ apart from $m_{\lfloor r/2 \rfloor}$ in the sorted order of
$Y$. Hence,

$|L(m^*, Y)| \ge |X_r| r/4 - 4q/n \ge 9n/200 - 4q/n \ge n/40$    (1)

whenever $n^2 \ge 200q$. In an identical manner we can show that $|U(m^*, Y)| \ge n/40$. Hence, $m^*$ is
an approximate median of $Y$. When $q \le n$ we just take $m^*$ as the median with the higher $|\tilde{L}(\cdot, Y)|$
value, which guarantees $|\tilde{L}(m^*, Y)| \ge n/40$ whenever $n^2 \ge 800q/13$. So we take $n^2 \ge 200q$ to
cover both cases. □

It immediately follows that $m^*$ is also an approximate median of $V$, with both $|L(m^*, V)|$ and
$|U(m^*, V)|$ lower bounded by $n/40$. Lastly, we note that the above process of computing an
approximate median makes $O(q + n)$ probes. This follows from the fact that computing the
medians makes $O(n)$ probes in total, and for each of the $\le 5q/n + 1$ medians we make $O(n)$
probes.
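The selection of $m^*$ can be sketched as follows. This is an illustration under stated assumptions: `cliques` holds the disjoint cliques $Y_i$, `adj` and `less` are a hypothetical adjacency map and comparison oracle, and each clique median is found by sorting for simplicity, whereas the probe bound in the text relies on linear-time selection.

```python
from functools import cmp_to_key

def approx_median(cliques, adj, less):
    """Pick the clique median whose approximate rank in Y is the middle one."""
    by_oracle = cmp_to_key(lambda a, b: -1 if less(a, b) else 1)
    Y = [y for clique in cliques for y in clique]
    medians = []
    for clique in cliques:
        ordered = sorted(clique, key=by_oracle)   # allowed: the clique is complete
        medians.append(ordered[len(ordered) // 2])

    def approx_rank(m):
        # Only neighbors of m can be probed; non-neighbors are not counted.
        return sum(1 for y in Y if y in adj[m] and less(y, m))

    ranked = sorted(medians, key=approx_rank)
    return ranked[len(ranked) // 2]
```

Because a candidate misses at most $4q/n$ neighbors, its approximate rank undercounts its true rank by at most that amount, which is the slack used in Claim 2.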

A divide-and-conquer approach: Now that we have computed an approximate median of
$V$ we proceed with a recursive approach. Let $m^*$ be the median. As in Section 2.1 we partition
$V$ into three sets $U$, $L$ and $B$. The sets $U$ and $L$ are the upper and lower sets with respect to $m^*$.
$B$ is the set of vertices that do not fall into either, that is, they are non-neighbors of $m^*$. Since
$m^* \in S$ we have $|B| \le 4q/n$. We recursively proceed to partially sort the sets $U$ and $L$ with the
corresponding graphs $G[U]$ and $G[L]$ and keep $B$ for later processing (as we did in the merging
step previously). Like before we can imagine a recursion tree $T$. Let $E_{f_P}$ be the set of forbidden
edges in $G[P]$. We take $n_P = |P|$ and $q_P = |E_{f_P}|$. For each node $P \in T$ there are two cases:

Case 1: When $n_P^2 \ge 200 q_P$, we recursively sort $P$. In this case we can guarantee that the approximate
median $m^*_P$ of $P$ will satisfy equation (1). That is, both $|L(m^*_P, P)|$ and $|U(m^*_P, P)|$ are $\ge
n_P/40$.

Case 2: Otherwise we probe all edges in $G[P]$. In this case $P$ will become a leaf node in $T$.

It can easily be seen that the depth of the recursion tree is bounded by $O(\log n)$, since at each
internal node $P$ of $T$ we pass sets of constant proportions (where the size of the larger of the
two sets is upper bounded by $(39/40)n_P$) to its child nodes.
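The test that decides between the two cases can be sketched as below. The function name `classify_node` and the adjacency map `adj` are hypothetical; counting forbidden pairs only inspects the graph and costs no probes.

```python
from itertools import combinations

def classify_node(P, adj):
    """Return ('recurse', 0) or ('leaf', number of edges to probe) for node P."""
    n_p = len(P)
    q_p = sum(1 for u, v in combinations(P, 2) if v not in adj[u])
    if n_p * n_p >= 200 * q_p:
        return "recurse", 0
    # Leaf: probe every allowed edge of G[P]; this count is O(q_p) here.
    return "leaf", n_p * (n_p - 1) // 2 - q_p
```

For instance, ten vertices with a single forbidden pair give $n_P^2 = 100 < 200$, so the node becomes a leaf with 44 edges to probe.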
Merge Step: In this step we start with the leaves of $T$ and proceed upwards. A parent node
$P$ gets two partial orders from its left and right children respectively. Then it probes all the
edges between its B-set and these partial orders to generate a new partial order and passes it on
to its own parent. This step works exactly as the “merge step” of the previous algorithm. The only
difference is that the B-sets here may not be of constant size but of size $\le 4q/n$.

Probe Complexity: We can determine the probe complexity by looking at the recursion tree
$T$. First we compute it for the forward partition step. At each internal node of $T$ we compute a
set of medians and pick an appropriately chosen element from it. Then we partition the set of
elements at the node by probing all edges between the selected element and the rest of the elements
in the node. As mentioned before, this only takes $O(q_P + n_P)$ probes for an internal node $P$. We
assume that all the leaves of $T$ are at the same depth; otherwise we can insert internal dummy
nodes and make it so. At each level of $T$ the total number of vertices over all nodes is $\le n$ and
the total number of forbidden edges is $\le q$. Hence we do $O(q + n)$ probes at any internal level of
$T$. So for a total of $O(\log n)$ internal levels in $T$ the number of probes done is $O((q + n)\log n)$ in
the forward partition step. If $P$ is a leaf node then we probe all edges in $G[P]$. There are at most
$\binom{n_P}{2} - q_P$ edges in $G[P]$. Since $P$ is a leaf node, according to equation (1), $n_P^2 < 200 q_P$. Hence we
make $\binom{n_P}{2} - q_P = O(q_P)$ probes. Summing this over all the leaves gives a total of $O(q)$ probes.
Hence the total probe complexity during the forward step is $O((q + n)\log n)$.

Now we look at the merging step. Merging happens only at the internal nodes. Let us look
at an arbitrary internal level of $T$. At each node $P$ of this level we probe all the edges in
$E(B_P, U_P \cup L_P \cup \{m^*_P\})$ and in $G[B_P]$. Note that we do not have to make any probes between $U_P$
and $L_P$, as they were already separated by the approximate median $m^*_P$. Hence the total number
of probes made in this node is $\le (|U_P| + |L_P| + |B_P| + 1)|B_P| \le n_P (4q_P/n_P) \le 4q_P$. Summing
over all the nodes at any given level gives us $O(q)$ as the probe complexity per level. So the
total probe complexity in the merging stage is $O(q \log n)$. Hence, combining the probes made
during the partition step and the merge step, we see that the total number of probes needed to sort $V$ is
$O((q + n)\log n)$.


Total Complexity: Now we look at the total complexity of the previous procedure. Again the
analysis is divided into the forward step and the merge step. In the forward step at each node $P$ we
perform $O(n_P^2)$ operations. This includes computing the degrees, finding the cliques, and computing
the approximate median. So at any level of $T$, regardless of it being an internal level or not, we
perform $O(n^2)$ operations. Hence it totals $O(n^2 \log n)$ operations in the forward step. However,
this is a conservative estimate and we can remove the $\log n$ factor as argued below. We can define
the recurrence for the forward computation as


$T(n) = \begin{cases} T(n/40) + T(39n/40) + O(n^2) & \text{if } n^2 \ge 200q \\ O(q) & \text{otherwise} \end{cases}$    (2)


This follows from the previous discussion. If we do not recurse on a node we guarantee that
$n_P^2 < 200 q_P$ for that node. Hence, we have $T(n) = O(n^2 + q)$ using the Akra-Bazzi method [22].
In the merge step, we only make $O(q_P)$ comparisons at any given node. We compute transitive
closures only at the leaves. However, for any leaf $P$ we have $n_P^2 < 200 q_P$. Hence computing the
transitive closure of $G[P]$ takes $O(q_P^{\omega/2})$ time. Hence, the total complexity of the above procedure
is $O(n^2 + q^{\omega/2})$. We summarize the results in this section with the following theorem:

Theorem 1. Given a graph $G(V, E)$ of $n$ vertices having $q$ forbidden edges, one can compute
the partial order of $V$ with $O((q + n)\log n)$ comparisons and in total $O(n^2 + q^{\omega/2})$ time.

Proof. Follows from the discussions in this section. □ 
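As a numerical sanity check of the quadratic term in recurrence (2), the sketch below evaluates a simplified version of it. The assumptions are ours: the hidden constant is taken to be 1, the $q$-dependent leaf cost is dropped, and the recursion stops once the size falls below 1. Under these assumptions an induction gives $T(x) \le C x^2$ with $C = 1/(1 - (1/40)^2 - (39/40)^2) = 1600/78$.

```python
def forward_cost(x):
    """Simplified recurrence T(x) = T(x/40) + T(39x/40) + x^2, with T(x) = 0 for x < 1."""
    if x < 1:
        return 0.0
    return forward_cost(x / 40) + forward_cost(39 * x / 40) + x * x

# The ratio T(x) / x^2 stays below the constant 1600/78 (about 20.5),
# so the forward step is quadratic with no extra log factor.
ratio = forward_cost(1000.0) / 1000.0 ** 2
```

The uneven split only inflates the constant; it does not add a logarithmic factor, because the squared proportions sum to less than 1.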


3 A Randomized Algorithm 

In this section we look at a more direct way of sorting by making random probes. The proposed
method is inspired by the literature on two-step oblivious parallel sorting algorithms [?], in
particular by a series of studies by Bollobás and Brightwell showing that certain sparse graphs can be
used to construct efficient sorting networks [?]. It was shown that if a graph satisfies certain
properties then probing its edges and taking the transitive closure of the resulting set
yields a large number of relations. Then we just probe the remaining edges that are not oriented,
which is guaranteed (with high probability) to be a “small” set.

The main idea is as follows. Let $\mathcal{H}_n$ be a collection of undirected graphs on n vertices having
certain properties. A transitive orientation of a graph $H(V, E) \in \mathcal{H}_n$ is an ordering of V together with the
induced orientation of the edges of H based on that ordering. Let $a$ be an ordering on V and let
$P(H, a)$ be the partial order generated by this ordering on H. It is in general only a partial order, since H
may not be sortable. Let $\mathcal{P} = P(H, a)$ and let $t(\mathcal{P})$ be the number of incomparable pairs in $\mathcal{P}$. We
want H to be such that $t(\mathcal{P})$ is small. If that is the case then $\mathcal{P}$ will have many relations, and if
H is also sparse then we can probe all the edges of H and afterwards be left with probing
only a small number of pairs. These are the pairs which were not oriented during the first round
of probing and the transitive closure computation that follows it. A graph H is useful for our purpose if
every transitive orientation of H results in many relations. We want to find a collection $\mathcal{H}_n$ such
that every graph in it is useful with high probability.
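
The two-round scheme described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the implementation analysed in the paper: the names `two_round_sort`, `transitive_closure`, `allowed` and `less_than` are hypothetical, the comparison oracle is assumed to be consistent with some hidden total order, and a cubic-time Warshall closure stands in for a closure based on fast matrix multiplication.

```python
import random

def transitive_closure(n, relations):
    """Return reach, where reach[u][v] is True iff u < v follows from `relations`."""
    reach = [[False] * n for _ in range(n)]
    for u, v in relations:          # each (u, v) records a probe outcome u < v
        reach[u][v] = True
    for k in range(n):              # Warshall's cubic-time closure (a stand-in)
        for i in range(n):
            if reach[i][k]:
                for j in range(n):
                    if reach[k][j]:
                        reach[i][j] = True
    return reach

def two_round_sort(n, allowed, less_than, p, seed=0):
    """allowed: list of comparable pairs (u, v); less_than(u, v): comparison oracle."""
    rng = random.Random(seed)
    relations, probes = [], 0

    def probe(u, v):
        nonlocal probes
        probes += 1
        relations.append((u, v) if less_than(u, v) else (v, u))

    # Round 1: probe a random spanning subgraph of G (each edge kept with probability p).
    for u, v in allowed:
        if rng.random() < p:
            probe(u, v)
    reach = transitive_closure(n, relations)

    # Round 2: probe only the allowed pairs that round 1 left unoriented.
    for u, v in allowed:
        if not reach[u][v] and not reach[v][u]:
            probe(u, v)
    return transitive_closure(n, relations), probes
```

Whatever the value of p, every allowed pair ends up oriented, so the sketch is always correct; the choice of p only affects how many probes the second round needs.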

We extend the results in [5111^ to show that a collection of certain conditional random graphs 
are useful, with high probability. In our case this random graph will be a spanning subgraph of 
the input graph G. Here we recall an important result from [3] which we will use in our proof.

Theorem 2 (Theorem 7 in [3]). If G is any graph on n vertices and G satisfies the following
property:

Q1: Any two subsets A, B of vertices having size $l$ have at least one edge between them.

Then the number of incomparable pairs in $P(G, a)$ is at most $O(nl\log l)$ for any ordering $a$.
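
For very small graphs, property Q1 can be checked by brute force. The sketch below is an assumption-laden illustration: the function name `satisfies_q1` is hypothetical, and it reads the property literally, allowing A and B to overlap or coincide, so that an edge "between" them is any edge with one endpoint in A and the other in B. It enumerates all pairs of size-l subsets and is therefore only usable for tiny n.

```python
from itertools import combinations

def satisfies_q1(n, edges, l):
    """Brute-force test of Q1 on the graph with vertices 0..n-1 and the given edges."""
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    subsets = list(combinations(range(n), l))
    for A in subsets:
        # All vertices that have at least one neighbour in A.
        neighbours = set().union(*(adj[a] for a in A))
        for B in subsets:
            if neighbours.isdisjoint(B):   # no edge joins A to B
                return False
    return True
```

On a path with four vertices, for instance, the check fails for l = 2 (take A = B = {0, 3}) and succeeds for l = 3.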

The input graph G is chosen by our adversary. However, we show that any random spanning
subgraph of G with an appropriate edge probability will satisfy Q1 with high probability. Let
$H_{n,p}(G)$ be a random spanning subgraph of G: $H_{n,p}(G)$ has the same vertex set as G, and
a pair of vertices in $H_{n,p}(G)$ has an edge between them with probability p if they are adjacent
in G; otherwise they are also non-adjacent in $H_{n,p}(G)$. All we need to prove is that, for any G on
n vertices, the random spanning subgraph $H_{n,p}(G)$ with edge probability p satisfies Q1 with
high probability. Since G has at most q forbidden edges, any two subsets of vertices A, B (not
necessarily distinct) of size $l$ must have at least $\binom{l}{2} - q$ edges between them. Let $E_{A,B}$ be the
event that the pair (A, B) is bad (they have no edges between them). Then the probability $S_{n,p}$
that there exists a bad pair is:


$$S_{n,p} := P\Big(\bigcup_i E_{A_i,B_i}\Big) \le \sum_i P(E_{A_i,B_i}) \qquad (3)$$

where the sum is taken over all $\binom{n}{l}^2$ pairs of subsets, and the number of edges between the
two sets A and B in G is $e(A, B) \ge \binom{l}{2} - q$. So we have,

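$$S_{n,p} \le \binom{n}{l}^2 (1-p)^{\binom{l}{2} - q} \le \exp\Big(2l\log(en/l) - p\Big(\binom{l}{2} - q\Big)\Big),$$

since $\binom{n}{l} \le (en/l)^l$ and $e^{-x} \ge 1 - x$.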


Hence $S_{n,p} \to 0$ as $n \to \infty$ whenever $\exp\big(2l\log(en/l) - p(\binom{l}{2} - q)\big) = o(1)$. Given $q < \binom{n}{2}$ it is
always possible to find appropriate values for p and $l$ as functions of q and n such that $S_{n,p} = o(1)$.
Given some value for the pair $(p, l)$, we see that in the first round we make $O(pn^2)$ probes with
high probability and in the second round $O(nl\log l)$ probes, again with high probability. So the
total probe complexity is $\widetilde{O}(pn^2 + nl)$. With some further algebra it can be shown that this is
$\widetilde{O}(n^2/\sqrt{q+n} + n\sqrt{q})$. We summarize this section with the following theorem:

Theorem 3. Given a graph G on n vertices with q forbidden edges, one can determine the partial
order on G with high probability in two steps by probing only $\widetilde{O}(n^2/\sqrt{q+n} + n\sqrt{q})$ edges in total
and in $O(n^\omega)$ time.

Proof. Follows from the preceding discussions. □ 
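
The condition under which the bound on $S_{n,p}$ vanishes can be explored numerically. The following sketch is an illustration only: the function names and the `slack` parameter are hypothetical, and the code simply evaluates the exponent $2l\log(en/l) - p(\binom{l}{2} - q)$ and solves for the smallest p that pushes it below `-slack` for a given $l$.

```python
import math

def log_bad_pair_bound(n, q, l, p):
    """Exponent 2 l log(e n / l) - p (C(l, 2) - q) of the bound on S_{n,p}."""
    return 2 * l * math.log(math.e * n / l) - p * (math.comb(l, 2) - q)

def min_edge_probability(n, q, l, slack):
    """Smallest p with exponent <= -slack, or None if no p in (0, 1] works."""
    margin = math.comb(l, 2) - q
    if margin <= 0:                # the subsets are too small to beat q forbidden pairs
        return None
    p = (2 * l * math.log(math.e * n / l) + slack) / margin
    return p if p <= 1 else None
```

A value of `None` signals that $l$ must be increased, which reflects the requirement $\binom{l}{2} > q$ implicit in the bound.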

3.1 When G is a Random Graph 

The above technique can easily be extended to the case when the input graph is random. Let
$G_{n,p}$ be the input graph, having n vertices and a uniform edge probability p. For such a graph
we can use equation (3) to bound $S_{n,p}$ as follows:

$$S_{n,p} \le \binom{n}{l}^2 (1-p)^{l^2} \le \exp(-pl^2 + 2l\log n)$$

Hence, we can choose any $l > 2\log n/p$ so that $S_{n,p} \to 0$ as $n \to \infty$. Let $l = 3\log n/p$.
Using Theorem 2 we have $t(G_{n,p}) = \widetilde{O}(nl) = \widetilde{O}(n/p)$. Since $G_{n,p}$ has $pn^2/2$ edges (with high
probability), the critical value of p at which $t(G_{n,p}) = pn^2/2$ is $\widetilde{O}(1/\sqrt{n})$. Let this be $\hat{p}$. If
$p > \hat{p}$ we can sort by making only $\widetilde{O}(n^{3/2})$ comparisons, since given $G_{n,p}$ we can construct a
random spanning subgraph $G_{n,\hat{p}}$ and use it as the random graph in our previous construction. Otherwise
we just probe all the edges, which makes $O(pn^2)$ comparisons. Thus we can sort $G_{n,p}$ with at most
$\widetilde{O}(\min\{n^{3/2}, pn^2\})$ comparisons with high probability. Hence, we get an elementary technique
to sort a random graph with at most $\widetilde{O}(n^{3/2})$ comparisons. The algorithm in [T] has a slightly
better bound on the number of comparisons. However, the total runtime of the algorithm in [T] is only
polynomially bounded when p is small. In our algorithm we need to compute the transitive closure
only twice, making it run in $O(n^\omega)$ total time.
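
The two calculations used in this subsection can be written out explicitly. The block below is a worked sketch rather than part of the original argument; it drops the $\log l$ factor from Theorem 2 when balancing the two terms, so the resulting critical value of p is accurate only up to logarithmic factors.

```latex
% Exponent of the bound on S_{n,p} with l = 3 log n / p:
-pl^2 + 2l\log n
  = -\frac{9\log^2 n}{p} + \frac{6\log^2 n}{p}
  = -\frac{3\log^2 n}{p} \longrightarrow -\infty .

% Balancing nl against the pn^2/2 edges of G_{n,p} (log l factor dropped):
n \cdot \frac{3\log n}{p} = \frac{pn^2}{2}
  \;\Longrightarrow\; p^2 = \frac{6\log n}{n}
  \;\Longrightarrow\; p = \sqrt{\frac{6\log n}{n}},
\qquad pn^2 = n^{3/2}\sqrt{6\log n}.
```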

Concluding Remarks 

In this paper we study the problem of sorting under non-uniform comparison costs, where costs
are either 1 or $\infty$. This cost structure is non-monotone, resulting in additional complexity. The
results presented here use only elementary techniques, yet achieve non-trivial bounds on probe
complexity. Further, we present strong evidence that the complexity of sorting V depends on
certain properties of the input graph, in particular the number of forbidden edges q. We derive
a non-trivial upper bound of $O((q + n)\log n)$ for the probe complexity. The total complexity of
our algorithm is bounded by $O(n^2 + q^{\omega/2})$. Since this matches the lower bound for the total complexity of the
problem modulo fast matrix multiplication, the proposed algorithm is almost optimal
in terms of the total complexity. We also present a randomized algorithm for the problem which
uses $\widetilde{O}(n^2/\sqrt{q+n} + n\sqrt{q})$ probes with high probability. When the input graph is random this
algorithm requires only $\widetilde{O}(\min\{n^{3/2}, pn^2\})$ probes, again with high probability.

Appendix

Proof of Lemma 1: 

Let $\{f_1(n), f_2(n), \ldots, f_k(n)\}$ be a finite set of non-negative monotonically increasing functions
in n such that:

1. $\forall i,\ f_i(n) \in o(g(n))$

2. $\sum_i f_i(n) \le c\,g(n)$

If $F(n) = \sum_i f_i^2(n)$, then $F(n) \in o(g^2(n))$.

Proof. First we prove $F(n) = O(g^2(n))$. Clearly,

$$\Big(\sum_i f_i(n)\Big)^2 \le c^2 g^2(n)$$

$$F(n) \le c^2 g^2(n)$$

Now we prove $F(n) \notin \Omega(g^2(n))$: assume that $F(n) \in \Omega(g^2(n))$; then there exists some
constant c such that $F(n) \ge c\,g^2(n)$ whenever $n > n_1$. Now let $f_i(n) \le c_i\,g(n)$ whenever
$n \ge n_0(c_i)$. Since $f_i(n) \in o(g(n))$ we can pick these $c_i$'s arbitrarily and independently of each
other. Now, for $n > \max(n_1, n_2)$ (where $n_2 = \max_i(n_0(c_i))$) we have,

$$\sum_i c_i^2\, g^2(n) \ge \sum_i f_i^2(n) \ge c\,g^2(n)$$

$$\sum_i c_i^2 \ge c$$

This contradicts the fact that the $c_i$'s can be assigned arbitrary values independent of each other.
That is, not all $f_i(n)$ will satisfy the condition $f_i(n) \in o(g(n))$ simultaneously. Hence, $F(n) \notin
\Omega(g^2(n))$.

□ 


Proof of Lemma 2: 

Let $T(n) = \sum_i T(n_i) + f(n)$, where $\sum_i n_i \le \delta n$ for some $0 < \delta < 1$ and $f(n) \in o(n^2)$. Then
$T(n) \in o(n^2)$.

Proof. Let us assume $T(n) = \Omega(n^\alpha)$ for some $\alpha \ge 1$; otherwise we are done. Hence,
$\sum_i T(n_i) \le T(\sum_i n_i) = T(\delta n)$. So the recurrence becomes $T(n) \le T(\delta n) + f(n)$. Using the Master theorem
we see that case 3 applies here, which gives $T(n) = \Theta(f(n)) = o(n^2)$. □


